Add gpt-5.4-codex to resolve_model_config.py #2376
juanmichelini wants to merge 4 commits into main
Co-authored-by: openhands <openhands@all-hands.dev>
API breakage checks (Griffe)
Result: Passed
Agent server REST API breakage checks (OpenAPI)
Result: Failed
all-hands-bot left a comment
🟢 Good taste - Clean addition following existing pattern.
This change is technically sound and follows the established pattern perfectly. However, since this modifies evaluation harness code (.github/run-eval/resolve_model_config.py), it should be reviewed by a human maintainer to confirm no unintended impact on benchmark results before merging.
@OpenHands update PR to also add a test and add it to heuristics like this PR does #2374
@OpenHands update PR #2376 to also add a test and add it to heuristics like this PR does #2374
I'm on it! juanmichelini can track my progress at all-hands.dev
- Add gpt-5.4-codex to model variant patterns in model_prompt_spec.py
- Add test_gpt_5_4_codex_config() test for configuration validation
- Add gpt-5.4-codex to reasoning effort test cases

Co-authored-by: openhands <openhands@all-hands.dev>
Summary
I've successfully updated PR #2376 to add tests and heuristics for the new model.
Changes Made
| File | Change |
|---|---|
| openhands-sdk/openhands/sdk/llm/utils/model_prompt_spec.py | Added gpt-5.4-codex to the GPT-5 codex variant patterns |
| tests/github_workflows/test_resolve_model_config.py | Added test_gpt_5_4_codex_config() test function |
| tests/sdk/llm/test_model_features.py | Added gpt-5.4-codex to reasoning effort test cases |
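For readers who have not opened the diff, a configuration-validation test of the kind listed in the table might look roughly like the sketch below. The `MODELS` mapping, its field names (`id`, `display_name`, `llm_config`), and the `litellm_proxy/` prefix are all assumptions made for illustration; the real layout of resolve_model_config.py and the real body of test_gpt_5_4_codex_config() may differ.

```python
# Hypothetical stand-in for the model table in resolve_model_config.py.
# The keys and field names here are assumptions, not the real schema.
MODELS = {
    "gpt-5.4-codex": {
        "id": "gpt-5.4-codex",
        "display_name": "GPT-5.4 Codex",
        "llm_config": {"model": "litellm_proxy/gpt-5.4-codex"},
    },
}


def test_gpt_5_4_codex_config() -> None:
    """Check that the (assumed) entry exists and is internally consistent."""
    model = MODELS["gpt-5.4-codex"]
    # The entry's id should match the key it is registered under.
    assert model["id"] == "gpt-5.4-codex"
    # A non-empty display name is expected for reporting.
    assert model["display_name"]
    # The LLM config should point at the same model name.
    assert model["llm_config"]["model"].endswith("gpt-5.4-codex")


test_gpt_5_4_codex_config()
```

A test like this only guards against typos and missing fields in the new entry; it does not exercise the model itself, which is what the integration tests are for.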
Checklist:
- ✅ Added test for configuration validation (like PR #2374, "Add gpt-5.4 to resolve_model_config.py")
- ✅ Added to heuristics in model_prompt_spec.py (like PR #2374, "Add gpt-5.4 to resolve_model_config.py")
- ✅ Added test case for reasoning effort support
- ✅ All pre-commit checks pass
- ✅ All relevant tests pass
- ✅ Changes pushed to remote branch
- ✅ PR description updated
Note: gpt-5.4-codex doesn't need to be explicitly added to REASONING_EFFORT_MODELS in model_features.py because "gpt-5" is already in that list and uses substring matching (which covers all gpt-5.x variants including codex). The test case I added confirms this works correctly.
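The substring-matching behavior described in the note can be illustrated with a minimal sketch. The list below is abbreviated to the one entry the note mentions, and `supports_reasoning_effort` is a hypothetical helper name; the actual lookup in model_features.py may be implemented differently.

```python
# Abbreviated, assumed stand-in for REASONING_EFFORT_MODELS in model_features.py.
# Only "gpt-5" is confirmed by the note; the real list has more entries.
REASONING_EFFORT_MODELS = ["gpt-5"]


def supports_reasoning_effort(model: str) -> bool:
    """Hypothetical helper: True if any listed pattern is a substring of the name."""
    name = model.lower()
    return any(pattern in name for pattern in REASONING_EFFORT_MODELS)


# "gpt-5" is a substring of every gpt-5.x name, including the codex variant,
# so no dedicated entry is needed for gpt-5.4-codex.
print(supports_reasoning_effort("gpt-5.4-codex"))
print(supports_reasoning_effort("gpt-4o"))
```

Under these assumptions the first call matches and the second does not, which is why an explicit list entry for the codex variant would be redundant.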
PR link: #2376
🧪 Integration Tests Results
Overall Success Rate: 0.0%
📊 Summary
📋 Detailed Results
litellm_proxy_gpt_5.4_codex
Skipped Tests:
Failed Tests:
Summary
Adds the gpt-5.4-codex model to resolve_model_config.py with corresponding tests and heuristics.
Changes
Configuration
Integration Test Results
Tests will run in CI.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22
- golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:bca110a-python
Run
All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (for example, bca110a-python) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (for example, bca110a-python-amd64) are also available if needed